Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM
Abstract
Summary: BWA-MEM is a new alignment algorithm for aligning sequence reads or assembly contigs against a large reference genome such as the human genome. It automatically chooses between local and end-to-end alignments, supports paired-end reads and performs chimeric alignment. The algorithm is robust to sequencing errors and applicable to a wide range of sequence lengths, from 70 bp to a few megabases. For mapping 100 bp sequences, BWA-MEM outperforms several state-of-the-art read aligners available to date.
Availability and implementation: BWA-MEM is implemented as a component of BWA, which is available at http://github.com/lh3/bwa.
Contact: [email protected]
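As an illustration of the single-end and paired-end modes mentioned in the summary, the following is a minimal sketch of how a `bwa mem` command line might be assembled from Python. The helper name `bwa_mem_command` and the file names are hypothetical and not part of BWA; the sketch only builds the argument list and does not run the aligner. It assumes the reference has already been indexed with `bwa index` and that alignments are written in SAM format to standard output.

```python
import shlex
from typing import List, Optional


def bwa_mem_command(
    reference: str,
    reads1: str,
    reads2: Optional[str] = None,
    threads: int = 1,
) -> List[str]:
    """Build (but do not execute) the argument list for a `bwa mem` run.

    Hypothetical helper for illustration only. `reference` is the FASTA
    file previously indexed with `bwa index`; `reads2` is given only for
    paired-end data.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # `-t` sets the number of threads used by `bwa mem`.
    cmd = ["bwa", "mem", "-t", str(threads), reference, reads1]
    if reads2 is not None:
        # A second reads file switches `bwa mem` to paired-end mode.
        cmd.append(reads2)
    return cmd


# Placeholder file names, assumed for the example.
single_end = bwa_mem_command("ref.fa", "reads.fq")
paired_end = bwa_mem_command("ref.fa", "read1.fq", "read2.fq", threads=4)

print(shlex.join(single_end))
print(shlex.join(paired_end))
```

The resulting list could be passed to `subprocess.run` with standard output redirected to a file to capture the SAM records, provided `bwa` is installed and on the PATH.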
Similar resources
Alignment of 1000 Genomes Project reads to reference assembly GRCh38
The 1000 Genomes Project produced more than 100 trillion basepairs of short read sequence from more than 2600 samples in 26 populations over a period of five years. In its final phase, the project released over 85 million genotyped and phased variants on human reference genome assembly GRCh37. An updated reference assembly, GRCh38, was released in late 2013, but there was insufficient time for ...
Aligner optimization increases accuracy and decreases compute times in multi-species sequence data
As sequencing technologies have evolved, the tools to analyze these sequences have made similar advances. However, for multi-species samples, we observed important and adverse differences in alignment specificity and computation time for bwa-mem (Burrows-Wheeler aligner-maximum exact matches) relative to bwa-aln. Therefore, we sought to optimize bwa-mem for alignment of data from multi-species...
CUSHAW3: Sensitive and Accurate Base-Space and Color-Space Short-Read Alignment with Hybrid Seeding
The majority of next-generation sequencing short-reads can be properly aligned by leading aligners at high speed. However, the alignment quality can still be further improved, since usually not all reads can be correctly aligned to large genomes, such as the human genome, even for simulated data. Moreover, even slight improvements in this area are important but challenging, and usually require ...
Supporting Text
Genome Sequencing and Assembly. Initial shotgun libraries were generated and sequenced at the Broad by the Microbial Sequencing Center yielding 76,452 (PA2192) and 77,884 (C3719) sequences (paired-reads). The reads were assembled using ARACHNE (1, 2). After refinement, final assemblies contained 82 (PA2192) and 124 (C3719) contigs with a total sequence spanning single scaffolds of 6.83 Mb (PA21...
Local De Novo Assembly of RAD Paired-End Contigs Using Short Sequencing Reads
Despite the power of massively parallel sequencing platforms, a drawback is the short length of the sequence reads produced. We demonstrate that short reads can be locally assembled into longer contigs using paired-end sequencing of restriction-site associated DNA (RAD-PE) fragments. We use this RAD-PE contig approach to identify single nucleotide polymorphisms (SNPs) and determine haplotype st...
Publication date: 2013